1.  Introduction


All aspects of information are increasingly vital to U.S. force employment, with measures and
countermeasures involving digital information becoming as important as actual weapons
systems. Battlespace digitization involves rapid transfer of data among sensors, intelligence
officers, commanders, and weapons. Our Southwest Asian conflicts show that modern warfare
requires improved knowledge processing. Vast quantities of disparate information can be
utilized via intelligent systems. Information is routinely computerized syntactically and
semantically. More sophisticated processing systems could include formal reasoning and
associative connections.

We  envisage  this  note  as  one  of  a  series  of  notes  leading  to  a  unified  approach  for  building  a 
model  of  battlespace  information  mediation,  a  term  which  connotes  conveyance  via  intermediary 
mechanisms  and  includes  notions  of  filtering,  summarization,  fusion,  and  inference.  We  intend 
to  develop  approaches  involving  the  following: 

•  division  of  decision-making  responsibility  in  complex  real-time  processors 

•  self-organizing  cognitive  software  systems  encoding  knowledge 

•  propagation  of  information  through  inference  nets 

•  selective  querying  internal  to  the  system  based  on  perceived  utility  to  the  reasoning  being 
performed 

•  consideration  of  cognitive  constructs  as  vector-algebraic  objects  to  be  manipulated 
symbolically  (including  derivation  of  measures  of  divergence  from  expectations  and  hence 
detection  of  deception) 

•  calculations  of  the  values  of  weapons  and  tactics  (indeed,  of  information  itself)  during  a 
conflict  based  on  actual  battlespace  parameters 

The  purpose  of  this  note  is  to  generate  discussion  on  information  fusion  modeling  for  research 
and  hypothesis  testing.  As  such,  it  does  not  represent  our  final  thinking.  Certain  formulations 
are,  perhaps,  too  simplistic;  but  this  is  just  an  initial  approach  to  studies  that  could  utilize  realistic 
data in battle command testbeds. The work is intended to encourage discussion among network
scientists of the Directorate, researchers qualified as analysts of tactical information concerns.
Network theory (especially as applied to information fusion) appears to be in a relatively
primitive state of development, and how to scale the solutions that do exist to postulated
battlespace requirements is poorly understood. This work is a modest start on
back-of-the-envelope analyses of battlespace digitization and knowledge fusion problems.

We consider several elementary concepts and extensions, including the following:

•  amount  and  value  of  information 

•  processor  level  and  rate 

•  process  rules  (e.g.,  separation,  consolidation) 

•  information decay

•  process  tasks,  comprising  subtasks  that  may  interact 

•  completion  time  and  accuracy 

•  time-independent  degradation  factors  that  affect  subtasks 

•  time-dependent  stress  function 

•  state  characteristic 

•  efficiency 

It  is  apparent  that  such  investigations  lend  themselves  to  leveraging  design  techniques  for 
computer  operating  systems  and  search  engines;  consider  such  methods  in  the  context  of  what 
follows. 

Several  assumptions  tend  to  arise  based  on  tactical  considerations.  For  instance,  most  data  will 
be  generated  at  the  lower  levels  and  most  data  will  not  result  in  information  at  the  higher  levels. 
Similarly,  information  processing  should  generally  be  pushed  down  to  the  lowest  level  possible, 
in  turn  minimizing  higher-level  overload.  Moreover,  in  analyzing  the  ability  to  get  data  to  where 
it  is  needed,  bandwidth  considerations  are  important.  We  hope  that  the  processing  model 
developed  from  theoretical  constructs  we  are  examining  can  be  used  to  address  such  assumptions 
and  yield  at  least  qualitatively  meaningful  insights. 

Given certain theoretical premises about information in a real-time system, the overall problem is
one of determining types and connections of processors at various levels to maximize utilizable
information. Related problems involve minimizing time through the net and minimizing process
cost. In considering the use of low-level data by high-level processors and the production and
utility of second-order facts, we are interested in developing results analytically where possible.
Another aspect of this is conceptual: for instance, second-order facts may connote disambiguation
of sensor input on the one hand or reasoned knowledge on the other. This note utilizes differential
equations as a modeling technique; however, other methodological approaches will be alluded to.


2.  A Paradigm


We  are  interested  in  representing  development  of  metainformation  based  on  an  accumulation  of 
information  within  a  network  of  communicating  nodes.  Observations  could  be  related  in  many 
ways:  no  relation,  sibling,  parent,  child,  includes,  included-in,  equals,  same-level,  etc.  To 
illustrate  some  characteristics  of  an  information  network,  consider  the  following  situation. 
Processors (devices for filtering, summarization, fusion, and inference) receive raw (possibly
related) data from the environment. Each accumulates a database and processes it into first-order
facts (possibly at a rate proportional to that of data reception) with the object of producing
second-order facts for a higher processor level. Queries and responses take place among
processors. Information utility may decay with time, and there may be information loss if
processors or networks are overloaded. As first-order facts are formed, gaps are realized in the
formation of second-order facts. Queries are made to other processors, possibly at a rate
proportional to the first-order formation rate. As second-order facts are completed, the processor
moves them to the next level.

Less  abstractly,  we  might  consider  that  to  know  the  battlespace  is  to  know  the  characteristics  of 
all  units  at  all  times.  Each  unit  comprises  subunits  (down  to  some  smallest)  having  various 
properties  such  as  position,  velocity,  strength,  and  attachment  as  functions  of  time.  Sensors  pick 
up  characteristics  of  the  smallest  subunits,  and  processors  combine  the  information.  Information 
itself  has  properties  such  as  veracity,  timeliness,  and  applicability.  Processor  mechanisms 
include  simplification  (e.g.,  filtering),  consolidation  (e.g.,  averaging  sensings  for  a  unit  position), 
and  separation  (e.g.,  observations  by  a  single  sensor  may  yield  information  for  distinct 
processors). 

One  way  of  conceptualizing  processor  types  involves  the  real-world  notion  of  echelonment.  The 
lowest-level  processor  might  be  associated  with  brigade  combat  team  (BCT)  information  and  be 
interested  only  in  the  recent  past  as  it  reflects  fast-moving  local  battles.  Higher-level  processors 
might  be  associated  with  higher  echelons  and,  in  developing  an  overall  battlespace  picture, 
would  accumulate  an  historical  database. 

Processing increases pragmatic content, and information value depends on level of application.
Moreover,  information  quality  may  be  related  to  amounts  obtained  from  different  sources. 
Second-order  development  calculi  must  account  for  interactions  among  information  amount, 
utility,  and  content.  For  instance,  not  all  facts  generated  heuristically  are  useful  or  even 
meaningful.  Processing  can  certainly  yield  additional  facts;  whether  it  is  able  to  increase  the 
amount  of  information  existing  in  the  raw  data  depends  on  one’s  semantics.  A  higher-order  fact 
will  generally  be  of  greater  utility  at  its  level  than  are  the  component  data,  even  though  in  an 
information-theoretic sense there may be less information.

An important and difficult problem, then, is: given time-varying observations of the smallest
subunits,  what  is  the  optimal  configuration  of  processing  mechanisms?  We  might  choose 
between  the  realms  of  data  mining  (pattern-recognition,  statistical  summarizations)  on  the  one 
hand,  and  of  knowledge  discovery  (inference  without  preconceived  notions,  development  of 
interesting  aspects  of  the  raw  data)  on  the  other.  Without  concerning  ourselves  with  such 
(somewhat  overlapping)  distinctions  at  this  point,  we  proceed  with  developing  an  abstract 
paradigm. 


3.  A  Differential  Formulation 


We consider the situation in figure 1:

Let
$I_i$ = information that node $i$ has obtained, raw input
$P_i$ = information node $i$ is processing, partial data
$O_i$ = output facts from node $i$, fully formed conclusions
$Q_i$ = queries from node $i$
$A_i$ = answers from node $i$
$dI_i/dt$ = forcing function (of battlespace, sensor)

Note all these are functions of time. Moreover, $dP_i/dt$ is a function of $P_i$ and $Q_j$; $Q_i$ and
$O_i$ are functions of $P_i$ and $A_j$; and $A_i$ is a function of $P_i$, $O_i$, and $Q_j$.

Assume (initially) there is no redundant information and that all information is correct. We are
interested in the movement of information from input, through the processing, to output: $O_1(t)$ is
desired. Moreover, we are concerned only with continuous amounts of information; this
formulation does not consider that discrete information is buffered in reality.

As a start at modeling a processor, we consider that $dP_1/dt = f(I_1, A_2, P_1)$. That is, we assume
the output rate is a function of the amount being processed and we ignore (for the time being) the
impact of outside queries (as opposed to answers) on the processing. To limit the rate we would
probably like $dO_1/dt$ to increase monotonically and asymptotically as $P_1$ increases. In relating
queries and answers we note, for a simple linear formulation, that $Q_i$ may be a fraction of $P_i$ and
be based on $I_i$ and $A_j$, and that $A_i$ may be a fraction of $P_i$ and be based on $Q_j$. Pending
resolution of whether query rate can be considered, to a first approximation, a linear function of
the amount being processed, we can now write a differential formulation of such a system as:


iP,=I,  +  Ai+Q.-0, 


dO, 

dt 


i1-^)  d7r=<I^A>) 


dt 


dt  dt  >• 


(1) 


Of  course,  the  situation  leading  to  this  formulation  is  simplistic.  The  main  purpose  of  this  note  is 
to  sketch  concepts  for  possibly  enhancing  the  development  of  more  realistic  models  of 
information  mediation.  As  a  transition  into  such  discussions,  figure  2  shows  a  schematic  that  is 
more  representative  of  the  real  world. 


4.  Types  of  Processing 


Why  use  parallel  or  distributed  processing,  as  opposed  to  one  large  processor?  Some  intuitive 
responses  present  themselves.  Bandwidth  constraints  will  generally  preclude  the  latter  approach. 
Access  time  for  desired  information  tends  to  increase  with  system  size  (given  that  the 
information is in the local database). The probability of obtaining desired information tends to
increase with time and with system size. A large system may eliminate redundant information
more readily. Since processors, in some sense, determine relationships among inputs, we seek to
compare the advantages of notional configurations in figure 3.

The  nature  of  our  processing  is  that  information  is  extracted  from  real-time  input  and  combined 
with data from a database of recent extractions. Rules must be developed for database formation
and  combination.  We  distinguish  a  processor  using  only  a  near-real-time  buffer  from  one  having 
access  to  an  accumulated  database. 

Assume information takes the form of n units (undefined data points, or information-theoretical
bits) of utilizable data per m observations. When these observations are processed, in a sense
they emerge stretched or compressed. Information loss may occur if the processor is overloaded.

Figure 2. A more sophisticated information-processing network.

A  main  derived  result  is  reduction  in  rawness,  that  is,  increase  in  utility.  The  next  level  has 
available  n  units  per  m*  observations.  Outputs  may  have  different  utilities  when  applied  to 
different  processors;  some  notion  of  utility  matrix  must  be  developed. 

Processing  may  involve  simplifying  information  from  one  source  (e.g.,  smoothing  data  as  with  a 
filter)  or  consolidating  information  from  several  sources  (e.g.,  averaging  positions  of  targets). 
Processing  may  also  involve  separating  types  of  information  (e.g.,  observations  by  a  single 
sensor  may  be  split  into  information  about  two  units  for  shipment  to  separate  higher-level 
processors). The point is that a processor may be thought of as a function. For instance, we
consider flows like those in figure 4, in which Roman letters represent functions, Greek letters
represent fractions, and + and * represent operators.

Figure 3. Two notional configurations.


Figure  4.  A  functional  processing  flow. 


Speculating further on the development of first-order and second-order facts, we might develop

$$\binom{m}{n} = \frac{m!}{n!\,(m-n)!}$$

derivative facts from $m$ raw data points, given that $n$ facts are required to generate one
derivative. The rate of first-order fact development depends on input rate and amount being
processed, e.g., it may be directly proportional to input rate and increase to an asymptote with
the amount being processed. If $n$ first-order facts are needed to yield one second-order fact but
some are unavailable, then querying other processors might fill in the gaps at a rate proportional
to the amount of second-order facts. However, we must account for overlap.

Note also that staleness can be accounted for in terms of processor load. Suppose the rate in is $I$
and the rate out is $F$. If $I = F$, staleness should equal 0. If $I < F$, then in some sense staleness
should also equal 0 (the system is waiting for enough input to yield output). We could, therefore,
say $I < F \Rightarrow F = I$. If $I > F$, we have build-up in the buffer going as $\int_0^t (I - F)\,dt$.
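
The load-based view of staleness described here can be sketched numerically. The following is only a minimal sketch, assuming a discrete time step and a fixed processing capacity standing in for the rate out; the function name `buffer_buildup` and the sample rates are hypothetical and do not come from the note.

```python
def buffer_buildup(input_rates, capacity, dt=1.0):
    """Track buffer backlog when the input rate I can exceed capacity F.

    Assumption (not from the note): time is discretized into steps of
    length dt, and a backlog drains whenever spare capacity exists.
    While I <= F and the buffer is empty, output simply equals input.
    """
    backlog = 0.0
    history = []
    for rate in input_rates:
        available = backlog + rate * dt        # old backlog plus new input
        processed = min(available, capacity * dt)
        backlog = available - processed        # excess (I - F) accumulates
        history.append(backlog)
    return history


# Hypothetical input-rate profile with a surge in the middle.
rates = [2, 2, 5, 5, 1, 1]
print(buffer_buildup(rates, capacity=3))
```

Under these assumptions the backlog is zero until the surge, grows by the excess of input over capacity during it, and drains afterwards; time spent in the backlog could then serve as a proxy for staleness.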

We can initially define the amount of information in the database as the time integral of rate of
raw information input. For example, if the input rate is a constant $k$, we have $D(t) = \int_0^t k\,dt = kt$.

By  formation  of  facts,  raw  information  might  be  cleared  out,  for  shipment  to  the  next  processor 
level  or  for  storage  in  a  higher-level-fact  database. 

Utility accumulates similarly, but we must account for decay with time. We can develop (via
discrete constant input) expressions such as

$$D(t) = \int_0^t k_a\,dt - \int_0^t\!\!\int_s^t k_d\,dt\,ds \qquad (2)$$

and

$$D(t) = \int_0^t k_a\,dt - \int_0^t\!\!\int_s^t k_d\left(1 - e^{-t}\right)dt\,ds, \qquad (3)$$

based on constant and exponential decay, respectively. In general, we may write

$$D(t) = \int_0^t f(t)\,dt - \int_0^t\!\!\int_s^t d(t)\,dt\,ds \qquad (4)$$

for utility in the base.

As  a  processor  at  a  fixed  level  develops  facts,  partial  information  is  formed.  Some  will  be  filled 
by  new  input,  some  will  be  filled  by  answers  from  querying  other  processors,  and  some  will  be 
unfillable.  Of  the  completed  and  partial  information  in  a  processor,  some  will  be  information 
that  other  processors  require,  so  we  must  model  the  waiting  time  for  completion  of  a  partial  fact. 

Since the processing network comprises a connected set of nodes, we can represent flow from
node $i$ to node $j$ by a 1 in the $ij$-entry of a square matrix of zeros and ones. With this
representation, the $n$th power of the matrix yields entries with the total number of $n$-stage paths
from node $i$ to $j$. Further, we can consider real or fractional entries as representing the amount
or fraction of information shipped. Some representation of hierarchy can be produced if for each
pair $i \neq j$, we have $a_{ij} = 1$ if $a_{ji} = 0$; there need be no transitivity. A matrix of transmission
capabilities/capacities may be necessary if there is to be treatment of redistributing inputs to
avoid low throughput.
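
The path-counting property of the flow matrix can be illustrated with a small example. This is a minimal sketch in plain Python; the four-node network and the helper names `mat_mul` and `mat_pow` are hypothetical and chosen only for illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_pow(a, n):
    """Raise a square matrix to the nonnegative integer power n."""
    size = len(a)
    result = [[int(i == j) for j in range(size)] for i in range(size)]  # identity
    for _ in range(n):
        result = mat_mul(result, a)
    return result


# Hypothetical net: nodes 0 and 1 feed node 2, node 2 feeds node 3,
# and node 0 also ships directly to node 3.
flow = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

two_stage = mat_pow(flow, 2)
print(two_stage[0][3])  # number of two-stage paths from node 0 to node 3
```

In this example the only two-stage paths run through node 2, and the third power of the matrix is all zeros because no path has three stages.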

A variety of functional forms could represent the output rate of a processor versus input rate;
e.g., linear increase, monotonic increase to asymptote, increase then decrease. This last form
may be particularly useful as it reflects the intuitive notion that a certain threshold amount of
data is required for generating higher-order facts but that too much clogs the processor.
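
The three qualitative shapes named here can be made concrete for experimentation. The specific formulas below are assumptions chosen for simplicity (the note does not prescribe them), and all function and parameter names are hypothetical.

```python
import math


def linear(rate_in, gain=0.5):
    """Output rate grows in direct proportion to input rate."""
    return gain * rate_in


def saturating(rate_in, max_out=10.0, half_point=5.0):
    """Monotonic increase toward the asymptote max_out (assumed hyperbolic form)."""
    return max_out * rate_in / (half_point + rate_in)


def rise_then_fall(rate_in, peak_in=5.0, peak_out=10.0):
    """Output peaks at rate_in == peak_in, then declines as the processor clogs.

    Assumed form: x * exp(1 - x), which has its maximum value 1 at x = 1.
    """
    x = rate_in / peak_in
    return peak_out * x * math.exp(1.0 - x)


for rate in (1.0, 5.0, 20.0):
    print(rate, linear(rate), saturating(rate), rise_then_fall(rate))
```

Fitting such candidate forms to the measured input/output behavior of real processors would be one way to choose among them.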

A differential formulation simpler than the one developed previously involves

$$\frac{dF}{dt} = \alpha\frac{dI}{dt} - \beta F, \qquad \frac{dS}{dt} = \gamma\frac{dF}{dt} - \delta S, \qquad (5)$$

where $F$ and $S$ are first- and second-order facts, $dI/dt$ is given input rate, and there is no
consideration of querying or degradation. Both these equations have the form

$$f' = \alpha g - \beta f, \qquad (6)$$

with solution

$$f = \frac{\alpha \int g\,e^{\beta t}\,dt + c}{e^{\beta t}}. \qquad (7)$$

Suppose $g(t)$ is a constant $k$. Then,

$$f = \frac{\alpha k \int e^{\beta t}\,dt + c}{e^{\beta t}} = \frac{\alpha k}{\beta} + \frac{c}{e^{\beta t}}. \qquad (8)$$

Assuming $f = 0$ at $t = 0$ yields

$$f = \frac{\alpha k\left(e^{\beta t} - 1\right)}{\beta e^{\beta t}}. \qquad (9)$$

Since $f \to \alpha k/\beta$ as $t \to \infty$, we have an example of self-limiting behavior. Given $g = k$, we would
like the processing to stabilize or max out with raw data being lost (or accumulated in the
database for subsequent processing with decayed utility).
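
The self-limiting behavior can be checked numerically. The sketch below integrates $f' = \alpha k - \beta f$ with $f(0) = 0$ by forward Euler steps and compares the result with the closed form $(\alpha k/\beta)(1 - e^{-\beta t})$, which is algebraically the same as the constant-input solution above. The parameter values are hypothetical, and the Euler scheme is an illustrative choice rather than part of the note.

```python
import math


def closed_form(t, alpha, beta, k):
    """Solution of f' = alpha*k - beta*f with f(0) = 0.

    Written as (alpha*k/beta) * (1 - exp(-beta*t)) to avoid overflow
    for large t.
    """
    return alpha * k / beta * (1.0 - math.exp(-beta * t))


def euler(t_end, alpha, beta, k, steps=100_000):
    """Forward-Euler integration of the same equation from t = 0 to t_end."""
    dt = t_end / steps
    f = 0.0
    for _ in range(steps):
        f += dt * (alpha * k - beta * f)
    return f


# Hypothetical parameter values.
alpha, beta, k = 0.8, 0.5, 10.0
print(closed_form(4.0, alpha, beta, k), euler(4.0, alpha, beta, k))
print(closed_form(50.0, alpha, beta, k), alpha * k / beta)  # approaches the limit
```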

We need to represent the fact that when raw data are combined, they are removed (or databased).
Suppose input arrives at a rate $dI/dt = k$, is combined, and leaves at a rate $\alpha m$, where $m$ is the
amount combined. Clearly, buildup occurs if $\alpha m < k$. The rate of change of amount of
information in the processor is

$$\frac{dP}{dt} = \frac{dI}{dt} - \alpha m. \qquad (10)$$

(In saying that $m$ raw facts are combined at some rate to yield one higher-order fact we assume
that the $m$ are available.) Thus,

$$P(t) = I(t) - \alpha m t + c, \qquad (11)$$

and, if $dI/dt = k$ and $P(0) = 0$, then

$$P(t) = (k - \alpha m)\,t. \qquad (12)$$

We may represent staleness or decay of information in the processor by the amount of time
information spends in the buffer.

We  have  avoided  mention  of  the  content  of  facts  being  processed,  and  from  an  abstract 
information  theoretical  point  of  view,  this  simplifies  matters.  However,  for  complex  systems,  we 
must  consider  heterogeneous  data  (e.g.,  separate  processing  of  sensings  of  vehicles  and 
observations  of  individuals).  Considering  context  is  more  difficult  yet.  Extracted  or  derived 
information  is  a  more  optimal  coding  (in  an  information  theoretic  sense),  and  it  contains  only 
part  of  the  source  information.  For  example,  given  two  positions,  extracted  data  could  be 
exemplified  by  the  distance  between  them. 

In maximizing utility output, we must consider that information becomes stale if it moves slowly
through the system. We want an expression for the decay of utility at a fixed level of processing.
Intuitively, if we set utility to 1 at time 0, then at times close to 0, utility should be close to 1.
Utility should decay to near 0 in finite time and approach 0 asymptotically. Another
complication arises when considering that the decay rate of a high-order fact may differ from
those of corresponding low-order facts. Information decay is a difficult theoretical problem
based partly on the nature of the process generating the facts. Decay involving utility of a
low-order fact to a low- or high-order processor may be more straightforward. We must also
consider that permanent facts (forming a fundamental basis for the processing, as opposed to
ephemeral or temporarily useful facts) may be thought of as having zero decay. Moreover,
decay as related to utility may be nonmonotonic, even highly so in certain tactical situations.
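
One candidate function with the monotonic properties listed here (utility 1 at time 0, close to 1 shortly afterwards, near 0 after a finite time, and 0 only asymptotically) is a compressed exponential. This is an assumed form offered as a starting point, not the note's own expression; the names `utility`, `time_scale`, and `sharpness` are hypothetical, and nonmonotonic decay is not covered.

```python
import math


def utility(age, time_scale, sharpness=2.0):
    """Assumed decay law u = exp(-(age / time_scale) ** sharpness).

    With sharpness > 1 the curve is flat near age 0 and then falls
    quickly. An infinite time_scale models a permanent fact (zero decay).
    """
    if math.isinf(time_scale):
        return 1.0
    return math.exp(-((age / time_scale) ** sharpness))


# Hypothetical fact with a time scale of 10 time units.
for age in (0, 1, 10, 30):
    print(age, round(utility(age, time_scale=10.0), 4))
```

Different time scales could be assigned to low-order and high-order facts to reflect their differing decay rates.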


5.  Utility  and  Value  of  Information 


Several  important  ideas  involve  the  concept  of  information  value.  We  will  attempt  to  use 
theoretical  concepts  of  information  rate  and  information  content  to  develop  this  concept. 
Intuitively,  pragmatic  content  of  (battlespace)  information  will  generally  depend  on  time,  and 
processing  may  increase  pragmatic  information.  Information  value  depends  on  the  level  of 
application.  Moreover,  utility  or  quality  of  information  will  generally  be  related  to  staleness  and 
to  the  amounts  of  information  coming  from  different  sources. 

One conventional definition of information is simply a numerical measure of the uncertainty of
an experimental outcome. This suffices for our purposes: we can say that information has value
to the extent it is useful in changing an outcome. Of course, this is a difficult proposition that
precipitates considerations of potential use and of measuring or predicting outcome change.

We  must  clarify  the  basic  terms  information  content  and  utility  and  construct  a  mathematical 
model  by  which  we  can  speak  of  information  being  produced  and  transmitted.  We  receive 
information  when  informed  of  an  event  whose  occurrence  was  uncertain.  Utility  carries  with  it 
this  notion,  as  well  as  one  of  applicability,  which  generally  refers  to  appropriateness  of  the  data 
to  a  given  processor  level  (e.g.,  knowledge  that  a  certain  BCT  is  under  attack  may  not  be 
important  at  Corps).  Usefulness  generally  involves  closeness  to  the  present  situation  (e.g.,  that  a 
vehicle  is  now  at  some  position  may  be  of  low  usefulness  in  2  days). 

Information  can  be  thought  of  as  just  a  tool,  an  item  of  neutral  value  without  regard  to  the 
circumstances  to  which  it  is  being  applied.  Another  way  to  consider  information  value  is 
analogously  to  the  value  of  a  test — the  difference  between  the  expected  gain  of  a  process  if  the 
test  is  conducted  and  the  expected  gain  if  no  test  is  run. 

Such  economic  metaphors  are  pervasive.  We  can  analyze  information  overload  in  terms  of 
marginal  cost  vs.  marginal  utility.  Another  economic  approach  is  to  consider  that  information  is 
worth  what  people  are  willing  to  pay  for  it — it  has  no  inherent  value  except  in  some  context.  A 
trivial  example  is  that  knowledge  of  a  state  capital  may  be  worth  a  million  dollars  on  a  game 
show  but  could  have  no  value  in  a  military  situation.  Indeed,  in  the  military  situation,  it  could 
conceivably  have  negative  value,  say,  in  the  sense  of  distraction.  One  avenue  of  investigation 
involves the notion of value being expressed as the product of utility (of a fact, as a tool in
achieving a purpose) and benefit (of that purpose, as in accomplishing a mission), with both of
these attributes subject to experimental evaluation.

A matrix formulation can be postulated in terms of contexts $C_i$ and pieces of information $I_j$;
the contextual value is $K_{ij}$. (We are at this point purposefully vague about units of information
and facts. We intend to tie these notions into mechanisms involving symbolic propagation of
potential  utility  or  numerical  propagation  analogously  to  Bayesian  nets.)  As  an  example,  let  us 
consider  the  value  of  gasoline.  If  we  need  to  operate  a  gasoline  engine,  it  may  have  great  value. 
If we need to extinguish a fire, it may have, at most, zero value. Similarly, water has little value
for  the  engine  and  some  value  for  the  fire  context.  So  it  is  for  information. 
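
The gasoline and water example can be written as a small context-by-item value table. The numbers below are purely illustrative assumptions (the note assigns no numerical values), and the names `value`, `best_item`, and `expected_value` are hypothetical.

```python
contexts = ["run engine", "extinguish fire"]
items = ["gasoline", "water"]

# Hypothetical contextual values: rows are contexts, columns are items.
value = {
    ("run engine", "gasoline"): 1.0,       # great value
    ("run engine", "water"): 0.1,          # little value
    ("extinguish fire", "gasoline"): -0.5,  # at most zero; here negative
    ("extinguish fire", "water"): 0.6,     # some value
}


def best_item(context):
    """Return the item with the highest value in the given context."""
    return max(items, key=lambda item: value[(context, item)])


def expected_value(item, context_probabilities):
    """Weight an item's contextual values by the probability of each context."""
    return sum(p * value[(context, item)]
               for context, p in context_probabilities.items())


print(best_item("run engine"), best_item("extinguish fire"))
print(expected_value("water", {"run engine": 0.5, "extinguish fire": 0.5}))
```

The same layout would apply with pieces of information in place of the two fluids and tactical contexts in place of the two tasks.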

One way to think of utility is to presume that information has value only with respect to
complete knowledge. Suppose total knowledge of a sector is represented by one observation
from each of $n$ subsectors. We might consider parallel presentation as yielding the linear utility
$u = (\text{subsectors reporting})/n$, but this does not reflect synergism or decay. Another approach is to
compute expected utility as the product of utility and probability of the application. We can
consider the contribution of various pieces of information to the value of the outcome.

Assuming truth (droppable in more realistic analyses) and that redundant information contributes
nothing, compressive processing should yield, when compared to input, a lower rate of
observations each of higher absolute utility.

Let the utility of a raw data point be unity at the lowest level. Then, if $m_i$ nonoverlapping
$(i-1)$-level observations are required to generate one fact at level $i$, we could take as the value
of a raw point for level 2 $m_2^{-1}$, for level 3 $(m_2 m_3)^{-1}$, etc. Of course, this notion holds only as
an average, given justifiable concerns over the number of higher-level facts that may actually
result from a single raw data point.
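
The level-by-level value of a raw point can be computed directly from the numbers of observations required at each level. This is a minimal sketch using exact fractions; the function name `raw_point_value` and the sample requirements are hypothetical.

```python
from fractions import Fraction
from math import prod


def raw_point_value(level, facts_needed):
    """Average value of one raw data point as seen at the given level.

    facts_needed maps each level i >= 2 to m_i, the number of
    nonoverlapping (i-1)-level observations needed for one level-i fact.
    """
    return Fraction(1, prod(facts_needed[i] for i in range(2, level + 1)))


# Hypothetical requirements: 4 raw points per level-2 fact,
# 3 level-2 facts per level-3 fact.
needed = {2: 4, 3: 3}
for level in (1, 2, 3):
    print(level, raw_point_value(level, needed))
```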

Information-theoretic considerations may help in conceptualizing change of utility as a result of
changing levels or of processing. A binary experiment contains $\log_2(2\ \text{outcomes}) = 1$ bit of
information, and an experiment with $n$ equally likely outcomes yields $\log_2 n$ bits. Although
processing can compress information or yield additional facts, whether it can increase the amount
of preexisting information depends on the frame of reference. We are interested in questions like
‘when a first-level processor yields one second-level fact from m first-order facts, what happens
to the absolute utility?’ and ‘can it be said that a system trades off absolute utility for speed?’ We
want to reflect the idea that some facts are more useful than others. Also, if we use explicit
character sets, a decrease/increase in the size of the set requires longer/shorter sequences to
contain the same information.
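
The trade-off between character-set size and sequence length can be sketched as follows, assuming equally likely symbols so that each symbol carries $\log_2$ of the set size in bits. The function names and the 24-bit example are hypothetical.

```python
import math


def bits_per_symbol(alphabet_size):
    """Information carried by one symbol drawn uniformly from the set."""
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    return math.log2(alphabet_size)


def symbols_needed(total_bits, alphabet_size):
    """Shortest sequence length able to carry total_bits of information."""
    return math.ceil(total_bits / bits_per_symbol(alphabet_size))


# Hypothetical message of 24 bits encoded with character sets of different sizes.
for size in (2, 16, 256):
    print(size, symbols_needed(24, size))
```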


Suppose node $i$ supplies information at a rate $F_i(t)$ to some higher-level node. Then the total
input rate at that node is $\sum_i F_i$. Now, if the rates change to $F_i^*$, the relative quality could be
expressed as $\sum_i F_i^* / \sum_i F_i$. It seems plausible that several detailed observations would have
greater utility to a higher level together after processing than they would as separate observations.
Utility, at least in the battlespace, may be considered a decreasing function of (increasing) level
and time. We need a method for representing composite utility of several facts (at several
stalenesses). Note that utility per se is independent of amount.


6.  Second-Order  Facts 


Given  a  (constant,  initially)  database,  how  does  a  processor  develop  second-order  facts? 
Intuitively,  if  decay  is  excluded,  all  possible  facts  should  be  developed  (asymptotically)  in  a 
manner  depending  on  the  processor  and  amount  of  information.  The  amount  of  data  has  the 
positive  attribute  of  making  more  information  available  for  completion  and,  if  large,  the  negative 
attribute  of  tending  to  overload  the  system. 

We examine some methods of combination, but note that one approach is simply to assume a
function with certain reasonable properties. For example, more facts should arise from more
information, and no second-order facts arise from one raw fact. It is probably reasonable to
consider that output rate is a function only of the amount in the processor.

Suppose there are $n$ tanks in a unit. One observed tank might be considered to yield $1/n$ amount
of information about the unit. Given two observations of tanks, several situations are possible:
the tanks are different, yielding $2/n$ information, or the tanks are the same, yielding $1/n$.
Extending this for $m$ observations, a weighted average arises, considering that a random sample
with replacement has probability of no repetition

$$\frac{n(n-1)\cdots(n-m+1)}{n^m}. \qquad (13)$$

Thus, for two observations, we have the probability of separate tanks as

$$\frac{n(n-1)}{n^2} = \frac{n-1}{n} \qquad (14)$$

and of the same tank $1/n$, so the information could be thought of as having value

$$\frac{n-1}{n}\cdot\frac{2}{n} + \frac{1}{n}\cdot\frac{1}{n} = \frac{2n-1}{n^2}. \qquad (15)$$
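
The weighted average for repeated tank observations can be verified by brute-force enumeration. The sketch below assumes each of the $n$ tanks is equally likely to be observed on every sighting and credits $1/n$ of information per distinct tank seen; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_information(n, m):
    """Average of (distinct tanks seen) / n over all n**m equally likely samples."""
    total = Fraction(0)
    for sample in product(range(n), repeat=m):
        total += Fraction(len(set(sample)), n)
    return total / n**m


def no_repetition_probability(n, m):
    """Probability that m draws with replacement from n tanks are all distinct."""
    numerator = 1
    for i in range(m):
        numerator *= n - i
    return Fraction(numerator, n**m)


# Hypothetical unit of 5 tanks observed twice.
print(no_repetition_probability(5, 2))  # (n - 1) / n
print(expected_information(5, 2))       # (2n - 1) / n**2
```

Enumeration grows as $n^m$, so this check is practical only for small units and few observations.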


With regard to missing information, suppose $m$ facts are needed for a higher-level fact. Then for
$n > m$ facts available,

$$\frac{n!}{m!\,(n-m)!}$$

higher-level facts are inherent. Toward bounding the conceptual model, we might think in terms of

$$\frac{n!}{[m-k]!\,[n-(m-k)]!}$$

potential facts with $k < m$ observations missing. Considerations such as these may yield thoughts
about changes in knowledge brought about by adding observations when a certain number have
already been received, leading to proper queries to other processors. If missing information is
related to a sector, it may be inversely proportional to the amount received. We must be cautious
about dealing with the probability of a requested fact filling in knowledge gaps; the content of the
information, not just the amount, may have to be considered.


We  need  theoretical  justification  for  combining  units  in  this  manner.  One  approach  is  simply  to 
consider  that  one  fact  is  associated  with  each  subset  of  a  database:  the  number  of  second-order 
facts associated with n information points is $2^n$. With this formulation, one fact (absence of
information)  is  associated  with  the  empty  set. 
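These counts are easy to tabulate. The following is a minimal sketch, assuming the binomial reading of the expressions above (n facts available, m needed per higher-level fact, k observations missing); the function names are illustrative only.

```python
from math import comb


def inherent_facts(n: int, m: int) -> int:
    """Higher-level facts obtainable when each needs m of the n available facts."""
    return comb(n, m)


def potential_facts(n: int, m: int, k: int) -> int:
    """Potential facts when k of the m required observations are missing."""
    if not 0 <= k < m:
        raise ValueError("k must satisfy 0 <= k < m")
    return comb(n, m - k)


def subset_facts(n: int) -> int:
    """One second-order fact per subset of n information points (empty set included)."""
    return 2 ** n


if __name__ == "__main__":
    print(inherent_facts(6, 3))      # 20
    print(potential_facts(6, 3, 1))  # 15
    print(subset_facts(5))           # 32
```

The subset count equals the sum of the binomial counts over all subset sizes, which ties the two formulations together.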


However facts are developed, we should consider basic words (undefined terms), basic sentences
(axioms), and logical rules in examining the ways sequences of situations develop in time. We
should be able to represent that phenomena comprise others. Any second-order development
calculus must account for interactions among amount, utility, and content. For instance, not all
facts generated by some simple rule of thumb will be useful or even meaningful. Perhaps
function fitting to input/output of real-world processors would assist in developing the theory.

As  mentioned,  multiple  observations  might  be  more  easily  stripped  out  of  the  system  at  higher 
levels.  Another  sort  of  redundancy  involves  observation  or  querying  for  information  derivable 
from  existing  facts.  An  essential  part  of  developing  an  optimal  network  is  understanding  how 
lower-level  facts  might  be  processed  directly  at  a  higher  level.  Such  facts  would  increase  the 
amount  in  a  higher-level  buffer  similarly  to  the  contribution  of  regular  facts,  but  may  be 
processed  less  efficiently — if  a  processor  is  not  intended  to  process  relatively  raw  information, 
that  information  has  lower  relative  utility.  It  could  be  argued  that  if  a  processor  must  perform 
low-level  functions  to  obtain  its  regular  data,  this  would  diminish  other  processing.  On  the  other 
hand,  when  low-level  facts  are  considered  at  this  level,  a  different  processing  methodology  might 
exist  to  use  them  directly.  In  any  event,  database  development  and  querying  would  take  different 
forms.  Possibly  the  same  amount  of  data  could  be  processed  regardless  of  its  intended  level,  but 
low-level input would have less utility at a higher level. We will also have to consider the
use of high-order data by low-order processors, particularly in a network that permits
inter-level querying.


Consider  the  generated  utility  of  a  processor-buffer.  Assume  that  data  come  into  the  buffer  with 
two  attributes:  amount  (as  time-rate  of  input)  and  utility  (also  a  function  of  time,  but 
independent  of  amount).  Assume  the  value  of  a  generated  fact  at  the  next  level  is  simply  a 
multiple  of  the  average  utility  in  the  buffer  at  the  time  of  generation.  Thus,  if  we  do  not  remove 
facts  or  consider  utility  decay,  we  can  write  average  utility  as 


$$ A(t) = \frac{\int_0^t r(t)\,u(t)\,dt}{\int_0^t r(t)\,dt} \tag{16} $$


where  the  buffer  is  initially  empty,  r(t)  is  the  input  rate,  and  u(t)  is  utility. 
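The average utility (rate-weighted utility divided by accumulated amount) can be evaluated numerically for any input rate and utility function. This is a minimal sketch under stated assumptions: a simple trapezoid rule, an initially empty buffer, and illustrative function names that do not come from the original note.

```python
import math


def integrate(func, a, b, steps=2000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def average_utility(r, u, t):
    """Integral of r*u over [0, t] divided by integral of r over [0, t]."""
    amount = integrate(r, 0.0, t)
    if amount == 0.0:
        raise ValueError("nothing has entered the buffer yet")
    return integrate(lambda s: r(s) * u(s), 0.0, t) / amount


if __name__ == "__main__":
    # Constant input rate and exponentially falling utility (assumed example).
    print(average_utility(lambda s: 1.0, lambda s: math.exp(-0.5 * s), 4.0))
```

With constant input rate and utility exp(-0.5 t), the exact average at t = 4 is (1 - exp(-2))/2, which the numerical value should approximate.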

Note  that  using  average  utility  in  generating  higher-order  facts  in  effect  assumes  instantaneous 
access.  It  may  be  argued  that  the  actual  buffer  should  comprise  data  only  within  some 
time-window.  We  may  be  able  to  leverage  methods  for  analysis  of  computer  algorithms  to 
incorporate  more  realistic  access  as  part  of  processing  cost.  There  should  be  additional 
examination of the presumption that a set of n facts of 1/n utility each is in some sense equivalent
to one fact of unity utility. We desire consistency between utility-decay of separate and
consolidated facts. We must distinguish among utility as a function of level, utility at a fixed
level as a function of time, and information content. For instance, can we say that n facts of
average utility $\alpha$ equal one higher-order fact of utility $\alpha n/m$, where m data points comprise one

higher-order  fact?  We  need  to  reflect  that  a  higher-order  fact  may  be  of  greater  utility  at  its  level 
than  the  component  data  points,  even  though  in  a  sense  there  may  be  less  information. 
7.  Removing  Data  From  the  Buffer


For  a  dynamic  battlespace,  facts  would  be  removed  from  the  buffer  as  the  situation  changed  and 
their  utilities  diminished.  Removal  after  utility  reaches  some  non-zero  minimum  may  be 
considered,  but  we  first  consider  removal  after  some  time  tm  following  entry  into  the  buffer.  We 
can  write 

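$$ \frac{dB}{dt}(t) = r(t) - r(t - t_m) \tag{17} $$

as the rate of change of amount of data in the buffer (with no decay). Thus, for

$$ 0 < t < t_m, \tag{18} $$

$$ B(t) = \int_0^t r(t)\,dt. \tag{19} $$

Afterwards,

$$ B(t) = \int_0^t r(t)\,dt - \int_{t_m}^{t} r(t - t_m)\,dt \tag{20} $$

or

$$ B(t) = \int_0^t r(t)\,dt - \int_0^{t - t_m} r(t)\,dt. \tag{21} $$

Thus, we have

$$ \frac{\int_0^t r(t)\,u(t)\,dt - \int_0^{t - t_m} r(t)\,u(t)\,dt}{\int_0^t r(t)\,dt - \int_0^{t - t_m} r(t)\,dt} \tag{22} $$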

as  the  average  utility  of  facts  in  the  buffer  at  t  >  tm  .  This  (appropriately  multiplied)  could  be 
used  as  u(t)  for  the  next  level. 
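Removal after a fixed residence time amounts to a sliding window: only data that entered during the last t_m time units remain. The sketch below illustrates that reading; the trapezoid integrator and all function names are assumptions made for the example.

```python
def integrate(func, a, b, steps=1000):
    """Composite trapezoid rule; returns 0.0 for an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def buffer_amount(r, t, t_m):
    """Amount in the buffer at time t when each datum is removed t_m after entry."""
    return integrate(r, max(0.0, t - t_m), t)


def windowed_average_utility(r, u, t, t_m):
    """Average utility of the data still in the buffer at time t."""
    amount = buffer_amount(r, t, t_m)
    if amount == 0.0:
        raise ValueError("buffer is empty")
    weighted = integrate(lambda s: r(s) * u(s), max(0.0, t - t_m), t)
    return weighted / amount


if __name__ == "__main__":
    # Constant input rate 3 with t_m = 2: the buffer holds 6 units once t > 2.
    print(buffer_amount(lambda s: 3.0, 5.0, 2.0))
    print(windowed_average_utility(lambda s: 3.0, lambda s: s, 5.0, 2.0))
```

For t < t_m the window starts at zero, so the same function covers both the initial filling period and the later steady regime.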

Many  mechanisms  exist  for  clearing  buffers.  For  instance,  with  regard  to  context,  it  may  be  that 
for  a  small  group  of  related  facts  simply  no  more  inferences  can  be  drawn.  Also  with  regard  to 
overload  problems,  we  may  want  to  place  facts  into  a  database  or  shunt  them  to  another 
processor,  possibly  at  a  different  level. 

Looking now at the buffer in terms of the time a unit of information spends there, we see that
given a beginning amount $B(t_b) = B_b$, input rate $r_i(t)$, and output rate $r_o(t)$, the amount in the
buffer at some ending time $t_e$ is $B_b + \int_{t_b}^{t_e} [r_i(s) - r_o(s)]\,ds$. Therefore, the average time spent in the
buffer is

$$ \frac{\int_{t_b}^{t_e} \left\{ B_b + \int_{t_b}^{t} [r_i(s) - r_o(s)]\,ds \right\} t\,dt}{\int_{t_b}^{t_e} \left\{ B_b + \int_{t_b}^{t} [r_i(s) - r_o(s)]\,ds \right\} dt}. \tag{23} $$

This may then be converted to utility. Of course, $r_i$ and $r_o$ may be modeled as functions of
parameters other than time: e.g., $r_o(B)$. Such dependencies yield more interwoven formulations
of average time and utility. We may be more interested in distributions of times or utilities at the
expense of complicating the representation.

Consider utility decay with time, initially without removal. A differential amount of data r(0)dt
enters at t = 0 with utility u(0). The information then decays with rate v(u(0),t), and utility of
the differential amount after time x could be written $u(0) - \int_0^x v(u(0),s)\,ds$. Similarly, for data
entering the buffer at any time t the utility at time x is $u(t) - \int_t^x v(u(t),s)\,ds$. Therefore, one
measure of the product of amount and utility in the buffer at time x is
$\int_0^x r(t)\left[u(t) - \int_t^x v(u(t),s)\,ds\right]dt$.

Moreover,  we  could  reasonably  remove  data  from  the  buffer  based  on  actual  utility  rather  than 
time.  This  must  be  pursued  for  realistic  analyses. 


8.  Processing  Rates 


Consider the time a unit of information takes to traverse a dual-level processor. The unit arrives
at the first buffer, with capacity a, at time $t_0$ and departs at $t_0 + a_0/m$, which we consider also as
instantaneous arrival at the second buffer, with capacity b. The unit leaves the b buffer similarly.
We wish to examine a(t) and b(t), assuming $a(t_0) = a_0$ and $b(t_0) = b_0$.


In subsequent evaluations, we consider compressed facts vs. original facts with regard to
information, keeping in mind that processing rate is independent of the number f of first-order
facts comprising a second-order fact. When the unit arrives at b, we have

$$ b\!\left(t_0 + \frac{a_0}{m}\right) = b_0 - n\,\frac{a_0}{m} + \frac{m}{f}\,\frac{a_0}{m}, \tag{24} $$

the original amount minus output during time of interest plus input during time of interest. The
time of departure of the unit from b reduces to

$$ t_0 + \frac{a_0}{m} + \frac{1}{n}\,b\!\left(t_0 + \frac{a_0}{m}\right) = t_0 + \frac{b_0}{n} + \frac{a_0}{nf}. \tag{25} $$

Denoting this by $t^*$ and letting $t_0 = 0$, we have

$$ t^* = \frac{b_0 f + a_0}{fn}. \tag{26} $$

Note this time is independent of the a-processing rate m. Further, $t^*$ is inversely proportional to the
b-processing rate n, and writing

$$ t^* = \frac{b_0}{n} + \frac{a_0}{fn} \tag{27} $$

shows it to be roughly inversely proportional to the comprisal number f. Thus, the throughput
time of a unit in this situation can be reduced by increasing n or f.
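The independence from the a-processing rate can be checked by computing the departure time step by step rather than from the closed form. This is a sketch under the assumptions used in the text (constant rates, m/f second-order facts entering b per unit time, and a b buffer that does not empty before the unit arrives); the function name is hypothetical.

```python
def throughput_time(a0, b0, m, n, f, t0=0.0):
    """Clock time at which a unit entering the a buffer at t0 leaves the b buffer.

    a0, b0: initial buffer contents; m, n: processing rates of a and b;
    f: number of first-order facts comprising one second-order fact.
    """
    arrive_b = t0 + a0 / m                   # unit clears the a buffer
    elapsed = arrive_b - t0
    # Contents of b on arrival: original amount, minus output, plus input.
    b_on_arrival = b0 - n * elapsed + (m / f) * elapsed
    return arrive_b + b_on_arrival / n       # unit clears the b buffer


if __name__ == "__main__":
    print(throughput_time(12.0, 6.0, 4.0, 3.0, 2.0))  # 4.0
    print(throughput_time(12.0, 6.0, 8.0, 3.0, 2.0))  # 4.0 (doubling m changes nothing)
```

Both calls agree with b0/n + a0/(f n) = 2 + 2, which is the closed form with t0 = 0.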

Now consider decay of first-order information moving through a constant-rate processor. With
“decay parameter” $\alpha$, we may write the value of a data unit as $u(t) = e^{-\alpha t}$. When this datum
clears the a-processor at time $t_0 + a_0/m$, it has value $e^{-\alpha a_0/m}$, assuming $t_0 = 0$. This can be a
prototype example of utility, staleness, and amount of information while passing through a series
of buffers/processors.

Assume  a  processor  has  some  maximum  buffer  size.  When  the  input  and  processing  rates  would 
cause  it  to  be  exceeded,  one  has  several  model  options.  One  is  sloughing  input:  resetting  to  a 
value  (e.g.,  0)  that  allows  processing  or  to  some  maximum,  either  absolute  or  situation- 
dependent.  Another  is  redirection  to  another  processor  at  the  same  level.  Another,  more 
sophisticated,  solution  is  to  remove  buffered  facts  before  their  normal  expiration.  Finally,  the 
incoming  data  could  be  shunted  to  a  higher-level  processor. 
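Two of these options can be compared in a small discrete-time simulation. The sketch below is only one possible reading: "slough" discards whatever does not fit, "redirect" counts the same excess as sent to a peer processor, and the time step, policy names, and function name are all assumptions introduced for illustration.

```python
def simulate_overflow(inputs, rate, capacity, policy="slough"):
    """Run a capped buffer over a sequence of per-step input amounts.

    Returns (final_level, dropped, redirected).
    """
    if policy not in ("slough", "redirect"):
        raise ValueError("unknown policy: " + policy)
    level = dropped = redirected = 0.0
    for arriving in inputs:
        level = max(0.0, level - rate)        # process what is already buffered
        space = capacity - level
        excess = max(0.0, arriving - space)   # part of the input that does not fit
        level += arriving - excess
        if policy == "slough":
            dropped += excess
        else:
            redirected += excess
    return level, dropped, redirected


if __name__ == "__main__":
    print(simulate_overflow([5, 5, 5], rate=2, capacity=6, policy="slough"))
    print(simulate_overflow([5, 5, 5], rate=2, capacity=6, policy="redirect"))
```

Early removal of buffered facts or shunting to a higher level would need a utility model for the stored items, so they are left out of this sketch.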

It might be argued that output (at least of a higher-level processor) should be time-shifted, with a
corresponding  decay  in  utility,  to  reflect  non-real-time  processing.  Possibly  another  way  to 
represent  lag  is  to  consider  that  the  buffer  used  in  generating  higher-order  facts  involves  data 
arriving  only  within  some  time  window. 


9.  Subtasks,  Stress,  and  Degradation 


Assume an information-processing task is a series of n nonoverlapping simple subtasks, where
the ith subtask normally takes time $\tau_i$ to complete. Imposing some degradation, initially
time-independent, define the degradation factor $d_i$ as the ratio of degraded completion time $\tau_i'$ to
normal completion time. Then, the degraded completion time of the task is $\sum_{i=1}^{n} d_i \tau_i$. Further, let
degradation be a constant $d_j$ for a fraction $s_j$ of a subtask (where $\sum_j s_j = 1$), and let $\tau$ denote
undegraded completion time. Then degraded $\tau' = \tau \sum_j d_j s_j$.
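Both sums are simple weighted totals, as the following sketch shows. It assumes time-independent degradation factors, and the function names and sample numbers are illustrative only.

```python
def degraded_task_time(taus, factors):
    """Sum of d_i * tau_i over a series of nonoverlapping subtasks."""
    if len(taus) != len(factors):
        raise ValueError("need one degradation factor per subtask")
    return sum(d * tau for d, tau in zip(factors, taus))


def degraded_subtask_time(tau, pieces):
    """tau * sum(d_j * s_j), where pieces is a list of (fraction s_j, factor d_j)."""
    if abs(sum(s for s, _ in pieces) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    return tau * sum(s * d for s, d in pieces)


if __name__ == "__main__":
    print(degraded_task_time([2.0, 3.0], [1.5, 2.0]))             # 9.0
    print(degraded_subtask_time(10.0, [(0.5, 1.0), (0.5, 3.0)]))  # 20.0
```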

Note that the strenuousness of a task might be reflected by a stress function. For example, a light
task could be represented by $y = k_l t$ ($k_l$ large), a moderately-difficult task could be represented
by $y = k_m t$ ($k_m$ medium) or $y = k_m(1 - e^{-m_m t})$ ($m_m$ large), and a strenuous task could be represented
by $y = k_s(1 - e^{-m_s t})$ ($m_s$ small). An overall stress function probably exhibits time-dependence.

Intuitively, efficiency could be described by expressions like $f(t) = 1 - kt$, $f(t) = k(t + 1)^{-1}$, and
$f(t) = e^{-kt}$. (Section 12 of this note considers the notion of efficiency in a somewhat different
manner.) The reciprocal of efficiency could be considered a (time) degradation function:
$d(t) = [f(t)]^{-1}$. In general, if efficiency is $f(t)$ for $t \in [t_0, t_1]$, we might have $\tau' = \int_{t_0}^{t_1} [f(x)]^{-1}\,dx$.

Consider a task in terms of the amount y(t) of processing accomplished over time. Consider
an efficiency function f(t) portraying stress on the processor in terms of relative ability to
perform the task. One might then propose this model for the degraded processing function:

$$ y'(t) = \int_{t_0}^{t} \dot{y}(x)\,f(x)\,dx, \tag{28} $$

where the prime marks the degraded quantity and $\dot{y}$ is the time derivative of y. The following
are two examples:

$$ y = kt,\; f = e^{-mt},\; t_0 = 0 \;\Rightarrow\; y' = \frac{k}{m}\left(1 - e^{-mt}\right) \tag{29} $$

and

$$ y = kt,\; f = 1 - mt \;(\therefore\; t = 1/m \Rightarrow f = 0),\; t_0 = 0 \;\Rightarrow\; y' = \frac{kt}{2}\,(2 - mt). \tag{30} $$

We now have an expression for the processing accomplished under ordinary and degraded
conditions. We would like a transform such that, given ordinary time t to complete y units
of processing, we can compute degraded time t' to accomplish y units. We try

$$ y' = g(t) \;\Rightarrow\; t' = g^{-1}(y), \tag{31} $$

where $g^{-1}$ denotes the inverse function. Then, the first example yields

$$ t' = -\frac{1}{m}\,\ln\!\left(1 - \frac{my}{k}\right) \tag{32} $$

and the second yields

$$ t' = \frac{1}{m}\left[1 - \left(1 - \frac{2my}{k}\right)^{1/2}\right]. \tag{33} $$
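The two worked examples can be checked numerically: integrate the rate k against the efficiency function, then invert to recover the degraded time. This is a sketch for the linear task y = kt only; the midpoint integrator and function names are assumptions for the example.

```python
import math


def degraded_amount(k, f, t, steps=4000):
    """Integral of k * f(x) over [0, t]: degraded processing for y = k t."""
    h = t / steps
    return sum(k * f((i + 0.5) * h) for i in range(steps)) * h  # midpoint rule


def degraded_time_exponential(y, k, m):
    """Time to accomplish y units when f(t) = exp(-m t); requires m*y < k."""
    return -math.log(1.0 - m * y / k) / m


def degraded_time_linear(y, k, m):
    """Time to accomplish y units when f(t) = 1 - m t; requires 2*m*y <= k."""
    return (1.0 - math.sqrt(1.0 - 2.0 * m * y / k)) / m


if __name__ == "__main__":
    k, m, t = 2.0, 0.5, 1.0
    y_exp = degraded_amount(k, lambda x: math.exp(-m * x), t)
    y_lin = degraded_amount(k, lambda x: 1.0 - m * x, t)
    print(y_exp, degraded_time_exponential(y_exp, k, m))  # second value is close to 1.0
    print(y_lin, degraded_time_linear(y_lin, k, m))       # second value is close to 1.0
```

The domain restrictions in the docstrings reflect that, under either efficiency function, only a bounded amount of processing can ever be accomplished.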


Consider a subtask in terms of the fraction s(t) accomplished. In an unstressed environment we
might have, normalizing for $t \in [0, \tau]$, $s(t) = \tau^{-1} t$. Assume a stressing effect

$$ \frac{ds(t)}{dt} = f(t), \tag{34} $$

that is,

$$ s(t) = \int_0^t f(x)\,dx. \tag{35} $$

Again, in an unstressed environment

$$ \frac{ds(t)}{dt} = \tau^{-1} \tag{36} $$

and

$$ s(t) = \tau^{-1} t \tag{37} $$

as t goes from 0 to $\tau$.

Accomplishing a fraction $s(t^*)$ that would ordinarily be accomplished by time $t^*$ will take with
degradation time $\int_0^{t^*} d(x)\,dx$. Clearly, for efficiency $f(t) = 1$ the degraded time is $t^*$. Under
unstressed conditions, we have

$$ s(t^*) = \frac{t^*}{\tau}. \tag{38} $$

Under stressed conditions, the fraction accomplished by $t^*$ can be expressed as

$$ s'(t^*) = \frac{\int_0^{t^*} f(x)\,dx}{\int_0^{\tau} f(x)\,dx}. \tag{39} $$

We can now answer the question ‘how much will be accomplished in an interval of length $t^* \le \tau$?’ by

$$ s'(t^*) = \frac{\int_{t_0}^{t_0 + t^*} f(x)\,dx}{\int_{t_0}^{t_0 + \tau} f(x)\,dx}, \tag{40} $$

where $t_0$ is clock time of subtask initiation. The question ‘how long will it take to accomplish a
fraction $s'(t^*)$ of the task?’ involves the more complicated expression

$$ F(t_0 + t^*) = s'(t^*)\,F(t_0 + \tau) + [1 - s'(t^*)]\,F(t_0), \tag{41} $$

where $F(x) = \int f(x)\,dx$. The (generally non-closed-form) solution for $t^*$ depends on $f(t)$.
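Because the solution for the elapsed time is generally not available in closed form, a root finder is a natural tool. The following sketch uses bisection on the ratio of efficiency integrals; it assumes f is positive on the interval so that the accomplished fraction increases with time, and the function names are hypothetical.

```python
import math


def integrate(f, a, b, steps=2000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


def fraction_done(f, t0, t_star, tau):
    """Fraction accomplished in [t0, t0 + t_star] relative to [t0, t0 + tau]."""
    return integrate(f, t0, t0 + t_star) / integrate(f, t0, t0 + tau)


def time_for_fraction(f, t0, tau, target, tol=1e-9):
    """Bisection for the t_star in [0, tau] that yields the target fraction."""
    lo, hi = 0.0, tau
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fraction_done(f, t0, mid, tau) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(time_for_fraction(lambda x: 1.0, 0.0, 10.0, 0.3))               # close to 3.0
    print(time_for_fraction(lambda x: math.exp(-0.2 * x), 0.0, 5.0, 0.5))
```

With constant efficiency the answer is simply the target fraction times the undegraded duration; with decaying efficiency, half of the work is done well before half of the time has passed.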

Given efficiency as a function of time and time for undegraded completion, what is the time for
completion with overall degradation? That is, what is the transformation of the subtask
summation $\sum_{i=1}^{n} d_i \tau_i$? One might say, letting $t_i$ denote clock time of the initiation of the ith subtask,

$$ \sum_{i=1}^{n} d_i \tau_i \int_{t_i}^{t_{i+1}} d(x)\,dx. \tag{42} $$

But how is $t_{i+1}$ determined? Given that transformed $\tau_i' = d_i \tau_i$, is it true that

$$ t_{i+1} = t_i + d_i \tau_i \int_{t_i}^{t_{i+1}} d(x)\,dx\,? \tag{43} $$

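The time to accomplish a given fraction of a subtask is stretched by the overall stress function in
such a way that (letting $t_{i+1} = t_i + \Delta t$):

$$ \begin{aligned}
\Delta t_1' &= d(0)\,\Delta t, & t_1' &= t_0 + \Delta t_1'; \\
\Delta t_2' &= d(t_1')\,\Delta t, & t_2' &= t_1' + \Delta t_2' \\
 & & &= (t_0 + \Delta t_1') + d(t_0 + \Delta t_1')\,\Delta t \\
 & & &= [t_0 + d(0)\,\Delta t] + d[t_0 + d(0)\,\Delta t]\,\Delta t;
\end{aligned} \tag{44} $$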

and  so  forth.  We  need  a  better  mapping  of  the  fraction  of  subtask  and  task  accomplished  to 
account  for  this  time  modification. 
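The recursion is straightforward to iterate: each nominal step is stretched by the degradation function evaluated at the current stretched clock time. A minimal sketch follows, assuming the degradation function d is supplied as a Python callable and that the first step is evaluated at the starting time; the function name is illustrative.

```python
def stretched_times(d, t0, dt, steps):
    """Clock times reached after each nominal step dt under degradation d(t)."""
    times = [t0]
    for _ in range(steps):
        current = times[-1]
        times.append(current + d(current) * dt)  # stretch the step at the current clock time
    return times


if __name__ == "__main__":
    print(stretched_times(lambda t: 2.0, 0.0, 1.0, 3))      # [0.0, 2.0, 4.0, 6.0]
    print(stretched_times(lambda t: 1.0 + t, 0.0, 0.5, 2))  # [0.0, 0.5, 1.25]
```

Letting the nominal step shrink turns this iteration into an Euler scheme for dt'/dt = d(t'), which may be one route to the better mapping the text asks for.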

Suppose we have the task of performing n subtasks simultaneously. What is a reasonable model
of task degradation d, given $d_i$ for each subtask i? If the subtasks are independent, perhaps
$\max_i \{d_i\}$ would do; if they interact, we try a different approach.

Consider a matrix $A_{n \times n} = [a_{ij}]$, where $a_{ij}$ represents the effect that subtask i has on subtask j: if
i is degraded one unit then j is degraded $a_{ij}$ units. (We postpone briefly a discussion of what is
meant by degraded one unit.) Let us now impose degradation in the form of an n-dimensional
vector $x = [x_1, x_2 \ldots x_n]$ and write the interacting degradation as xA. This would seem to
indicate that, for example, subtask 1 is degraded $x_1$ units due to subtask 1 (since, clearly, A's
diagonal consists of ones), plus $a_{21} x_2$ units due to subtask 2, plus $a_{31} x_3$ units due to subtask 3, and so
on. Then, for the overall task, perhaps the weighted sum $w(xA)^T$ might represent the overall
degradation, where $w = [w_1, w_2 \ldots w_n]$ and $\sum_{i=1}^{n} w_i = 1$, depending on importance of subtasks.

Let us now return to degraded one unit. We have discussed the degradation factor $d = \tau'/\tau$.
Consider this example:

$$ [1,\,2] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = [1,\,2], \tag{45} $$

with $w_1 = 0.2$ and $w_2 = 0.8$. Here, we have overall degradation 0.2 + 1.6 = 1.8. Given that
subtask 2, degraded, takes twice as long as normal, is it reasonable that the overall task takes
1.8 times as long? Apparently, overall degradation must mean something different in this
simultaneous  formulation. 
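The weighted-sum proposal can be written out directly. The sketch below uses plain lists, reproduces the two-subtask example with no interaction, and then adds a cross-effect; the interaction value 0.5 and the function names are assumptions for illustration.

```python
def interacting_degradation(x, A):
    """Row vector x times matrix A: component j sums x[i] * A[i][j] over i."""
    n = len(x)
    return [sum(x[i] * A[i][j] for i in range(n)) for j in range(n)]


def overall_degradation(w, x, A):
    """Weighted sum w . (xA), with weights w expected to sum to 1."""
    return sum(wj * v for wj, v in zip(w, interacting_degradation(x, A)))


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(overall_degradation([0.2, 0.8], [1.0, 2.0], identity))  # about 1.8
    coupled = [[1.0, 0.5], [0.0, 1.0]]  # subtask 1 also degrades subtask 2
    print(interacting_degradation([1.0, 2.0], coupled))           # [1.0, 2.5]
```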
10.  Accuracy


Intuitively,  one  might  expect  accuracy  to  decrease  with  time  and  task  difficulty.  Can  we  account 
for  the  effects  of  decreased  accuracy  due  to  degradation?  Perhaps  inaccuracy  in  one  subtask 
affects  all  subsequent  subtasks.  Inaccuracy  in  a  component  subtask  probably  affects  the  whole 
task.  Inaccuracy  (perhaps  defined  as  measured  deviation  from  normal  at  a  given  fraction  of 
subtask  completion)  could  be  treated  analogously  to  time  dilation.  Increased  error  frequency  is 
probably  a  discretization  of  inaccuracy. 

Another way to think of accuracy is in terms of a time-varying Poisson distribution. The
invariant Poisson distribution

$$ p(x) = \frac{\lambda^x e^{-\lambda}}{x!} \tag{46} $$

gives the probability x errors will occur in some time interval, under the following assumptions.
Consider a fixed unit of time T in which errors may occur. We assume errors occur
independently and that for a short $\Delta t$, the probability of one error is proportionate to the length of
the interval, i.e., equals $(k/T)\,\Delta t$, with k constant during T. We also assume the probability of two
or more errors during $\Delta t$ is negligible. Then it can be shown that the earlier distribution is
obtained, where $\lambda = (k/T)\,t$, and t is the interval length. Here, $k/T$ is the number of errors per unit
time, as derived empirically in an unstressed environment over the period T. For a time-varying
situation, $k/T$ and $\lambda$ are functions of time.


Suppose

$$ \frac{k}{T} = c\tau, \tag{47} $$

$\tau$ measuring clock time from the beginning of stress. A formal substitution yields

$$ p(x) = \frac{(c\tau t)^x\,e^{-c\tau t}}{x!}. \tag{48} $$

However, we are really interested in an interval $[\tau_0, \tau_0 + t]$ for which k is a function of $\tau$. Can
we use for $k/T$ an average of $k/T$ over the interval? Formal substitution into the time-invariant
equation of the expression $\int_{\tau_0}^{\tau_0 + t} h(\tau)\,d\tau$, where $h(\tau)$ is an error-rate function, yields

$$ p(x) = \frac{\left[\int_{\tau_0}^{\tau_0 + t} h(\tau)\,d\tau\right]^x \exp\!\left[-\int_{\tau_0}^{\tau_0 + t} h(\tau)\,d\tau\right]}{x!}. \tag{49} $$

Here is another instance in which plotting specific distributions, along with time-invariant $k/T$,
may yield additional insights.
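Error-count probabilities for a time-varying rate can be tabulated by integrating the error-rate function over the interval and using the result as the Poisson mean. This is a hedged sketch: the trapezoid integrator, the linear example rate, and the function names are assumptions, not the note's own procedure.

```python
import math


def integrate(func, a, b, steps=1000):
    """Composite trapezoid rule on [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def error_count_pmf(x, h, tau0, t):
    """P(x errors in [tau0, tau0 + t]) for a Poisson process with rate h(tau)."""
    lam = integrate(h, tau0, tau0 + t)  # expected number of errors in the interval
    return lam ** x * math.exp(-lam) / math.factorial(x)


if __name__ == "__main__":
    rate = lambda tau: 0.4 * tau        # error rate growing linearly under stress
    for x in range(4):
        print(x, error_count_pmf(x, rate, 1.0, 2.0))
```

A constant rate recovers the time-invariant distribution, so the same function can produce both curves for the comparison plots suggested in the text.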

11.  Efficiency  and  Compounding 


Let us now write the efficiency of task performance as

$$ E = \frac{w}{w + R}, \tag{50} $$

where w is time spent actually processing and R is idle time. It is apparent that, in general, the
overall efficiency of a task comprising a (nonoverlapping) series of subtasks is not the sum of the
subefficiencies; the total is

$$ E_T = \frac{\sum w_i}{\sum w_i + \sum R_i}. \tag{51} $$

Although two sets of efficiencies generated by different times can yield the same total efficiency

$$ \left(\text{e.g., } \frac{n}{n + n} = \frac{m}{m + m}\right), \tag{52} $$

this is not generally true. Consider that

$$ E_T = \frac{w_1 + w_2}{w_1 + w_2 + R_1 + R_2}; \tag{53} $$

now

$$ E_T = \frac{w_1 + w_2}{(w_1 + w_2) + \dfrac{w_1(1 - E_1)}{E_1} + R_2} = \frac{w_1 + w_2}{(w_1/E_1) + w_2 + R_2}, \tag{54} $$

a function of $w_1$. That is, total efficiency depends on specific subtask times.
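The dependence on specific subtask times is easy to see numerically: holding each subtask efficiency fixed while changing one processing time changes the total. The sketch below assumes the definition E = w/(w + R) from the text; the function names and sample values are illustrative.

```python
def efficiency(w, R):
    """Processing time divided by processing plus idle time."""
    return w / (w + R)


def total_efficiency(ws, Rs):
    """Efficiency of a nonoverlapping series of subtasks."""
    return sum(ws) / (sum(ws) + sum(Rs))


if __name__ == "__main__":
    # Subtask 1 keeps E_1 = 0.5 (idle time equals processing time);
    # subtask 2 keeps E_2 = 1.0 (no idle time).
    for w1 in (1.0, 3.0):
        ws, Rs = [w1, 1.0], [w1, 0.0]
        print(efficiency(ws[0], Rs[0]), efficiency(ws[1], Rs[1]),
              total_efficiency(ws, Rs))
```

The subtask efficiencies print as 0.5 and 1.0 in both rows, while the total moves from 2/3 to 4/7.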

Suppose we write degraded efficiency

$$ E_d = \frac{D_w w}{D_w w + D_R R}, \tag{55} $$

where degradation factors $D_w$ and $D_R$ are greater than or equal to unity. It turns out that

$$ E_d = \frac{D_w}{D_w + D_R\left(\dfrac{1 - E}{E}\right)}. \tag{56} $$

However, the fact that R = 0 when E = 1 flaws this expression for $E_d$, since $D_R R = 0$ when E = 1.

Suppose we try

$$ E_d = \frac{D_w w}{D_w w + (R + R_d)}, \tag{57} $$

where $R_d$ is additional idle time in the degraded mode. Now, this cannot lead to an expression
purely in terms of E; the best we can do is something like

$$ E_d = \frac{D_w w}{w(D_w - 1) + T + R_d}, \tag{58} $$

where $T = w + R$, or

$$ E_d = \frac{D_w E T}{T[D_w E + (1 - E)] + R_d}. \tag{59} $$

Some simplification results by letting T = 1; then,

$$ E_d = \frac{D_w E}{D_w E + (1 - E) + R_d}. \tag{60} $$

However, this formulation also suffers, as exemplified by

$$ E = \frac{.5}{.5 + .5} = .5, \tag{61} $$

$$ E_d = \frac{(1.5)(.5)}{(1.5)(.5) + (.5 + .25)} = .5 \tag{62} $$

(increasing idle time and processing time yields the same efficiency). Previously, we had

$$ E_d = \frac{D_w w}{D_w w + (R + R_d)}. \tag{63} $$

In an unstressed situation, $E_d = E$. For $D_w = 1$, we have

$$ E_d = \frac{w}{w + (R + R_d)}; \tag{64} $$

doubling idle time yields

$$ \frac{E_d}{E} = \frac{w + R}{w + 2R}, \tag{65} $$

a seemingly acceptable ratio. So, with this formulation,

$$ E_d = \frac{D_w E T}{T[D_w E + (1 - E)] + R_d}. \tag{66} $$

Let us return to the original expression

$$ E_d = \frac{D_w}{D_w + D_R\left(\dfrac{1 - E}{E}\right)}, \tag{67} $$

where R > 0. In an unstressed situation, $E_d = E$. For $D_w = 1$, we have

$$ E_d = \frac{w}{w + D_R R}; \tag{68} $$

doubling idle time again yields the seemingly acceptable ratio

$$ \frac{E_d}{E} = \frac{w + R}{w + 2R}. \tag{69} $$

Let us, as before, take the numerator of the first $E_d$ expression to be w; we can then derive

$$ E_d = \frac{1}{D_w + D_R\left(\dfrac{1 - E}{E}\right)}, \tag{70} $$

where $E \ne 0$ and $R \ne 0 \Rightarrow E \ne 1$. With this formulation, $E_d \to 0$ as $E \to 0$ and
$E_d \to 1/D_w$ as $E \to 1$. Plots of the two expressions for $E_d$ based on several values of $D_w$, $D_R$, and
$R_d$ would be a useful illustration.

Let us continue with our analysis of the expression

$$ E_d = \frac{w}{D_w w + D_R R}. \tag{71} $$

We have from

$$ E_d = \frac{1}{D_w + D_R\left(\dfrac{1 - E}{E}\right)} \tag{72} $$

that

$$ \frac{dE_d}{dE} = \frac{D_R}{[E D_w + D_R(1 - E)]^2}. \tag{73} $$

Since this is always greater than 0, $E_d$ increases monotonically with E. Also, it can be shown
that the second derivative is positive when $D_R > D_w$; in that case $E_d$ as a function of E is concave
upward, and the whole curve lies above the line $E_d = E/D_R$. This model implies that degrading
lower efficiency is always worse than degrading higher efficiency; tasks which are performed well
to begin with are more resilient.
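The behavior of the degraded efficiency as a function of undegraded efficiency can be tabulated as a first step toward the plots suggested earlier. This sketch assumes the form with numerator w, rewritten as E/(E D_w + D_R (1 - E)), together with its derivative D_R/[E D_w + D_R (1 - E)]^2; the function names and sample factors are illustrative.

```python
def degraded_efficiency(E, Dw, Dr):
    """E_d = 1 / (Dw + Dr * (1 - E) / E), valid for 0 < E <= 1."""
    if not 0.0 < E <= 1.0:
        raise ValueError("E must lie in (0, 1]")
    return E / (E * Dw + Dr * (1.0 - E))


def degraded_slope(E, Dw, Dr):
    """Derivative of E_d with respect to E."""
    return Dr / (E * Dw + Dr * (1.0 - E)) ** 2


if __name__ == "__main__":
    Dw, Dr = 1.5, 2.0  # assumed degradation factors
    for E in (0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
        print(E, degraded_efficiency(E, Dw, Dr), degraded_slope(E, Dw, Dr))
```

With both factors equal to one the function returns E unchanged, and at E = 1 it returns 1/D_w, matching the limits stated in the text.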

What does

$$ E_d = \frac{w}{D_w w + D_R R} \tag{74} $$

mean with regard to completion time? What does multiplying $E_d$ by a constant mean
operationally? We have from

$$ T_d = D_w w + D_R R = \frac{w}{E_d} \tag{75} $$

that completion time for what is normally accomplished in w (undegraded processing time) is
inversely proportional to $E_d$. Also, since

$$ T = \frac{w}{E} \tag{76} $$

and

$$ T_d = \frac{w}{E_d}, \tag{77} $$

we have

$$ T_d = \frac{T E}{E_d}, \tag{78} $$

so that apparently undegraded efficiency is necessary to compute degraded completion time.
Ordinarily for a subtask, we have

$$ E_i = \frac{w_i}{w_i + R_i}. \tag{79} $$


Suppose we consider subtasks performed simultaneously until the total task is accomplished.
Suppose attention is divided among subtasks according to factors $\alpha_i$ such that $\sum \alpha_i = 1$. Then $w_i$
might be transformed to $w_i/\alpha_i$. With regard to idle time, we might assume a linear relationship
between (transformed) $w_i$ and $R_i$ until proven otherwise. But in this context, what is meant by
combined idle time? Perhaps $\max_i \{R_i\}$ would be a reasonable formulation.

Perhaps an analog to the electrical resistance law

$$ R_T = \frac{1}{\sum_i \dfrac{1}{R_i}} \tag{80} $$

could be used for compounding efficiencies, since in a sense, the reciprocal of efficiency is a
measure of resistance to task completion. The formulations


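$$ E_T = \sum_i \left(\frac{1}{E_i}\right)^{-1} \tag{81} $$

and

$$ \frac{1}{E_T} = \frac{1}{\sum_i \left(\dfrac{1}{E_i}\right)} \tag{82} $$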


are flawed, since total efficiency greater than unity is possible. Trying

$$ E_T = \frac{1}{\sum_i \dfrac{1}{E_i}} \tag{83} $$

yields

$$ E_T = \frac{E_1 E_2}{E_1 + E_2} \tag{84} $$

for the two-subtask case, and for n subtasks of equal E, we have

$$ E_T = \frac{E}{n}, \tag{85} $$

so this analog may offer possibilities. Of course, it may be argued that our basic concept of
compounded efficiency is too simplistic, if only because it does not consider nonindependence of
subtasks.
Considering that 0 resistance connotes E = 1 and infinite resistance E = 0, we might try
resistance in accordance with

$$ R = \frac{1 - E}{E}. \tag{86} $$

Looking at

$$ R_T = \frac{1}{\sum_i \dfrac{1}{R_i}} \tag{87} $$

again, we would have

$$ \frac{1 - E_T}{E_T} = \frac{1}{\sum_i \dfrac{E_i}{1 - E_i}}; \tag{88} $$

this works for the single subtask, but cannot be used with $E_i = 1$. For n subtasks of equal E, we
have

$$ \frac{1 - E_T}{E_T} = \frac{1}{n\,\dfrac{E}{1 - E}} \;\Rightarrow\; E_T = \frac{nE}{1 + (n - 1)E}. \tag{89} $$


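It is apparent this yields increasing values of $E_T$ with increasing n. Let us try

$$ E_T = \frac{E}{1 + (n - 1)E}, \tag{90} $$

where we justify the division by n as an n-way splitting of attention.
We have from

$$ \frac{1 - E_T}{E_T} = \frac{1}{\sum_i \dfrac{E_i}{1 - E_i}} \tag{91} $$

that

$$ E_T = \frac{\sum_i \dfrac{E_i}{1 - E_i}}{1 + \sum_i \dfrac{E_i}{1 - E_i}}. \tag{92} $$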


For equal attention factors we may divide this by n; but, we need to account for nonequal
attention factors, so consider the expression

$$ E_T = \frac{\sum_i \dfrac{E_i/n}{1 - E_i/n}}{1 + \sum_i \dfrac{E_i/n}{1 - E_i/n}} = \frac{\sum_i \dfrac{E_i}{n - E_i}}{1 + \sum_i \dfrac{E_i}{n - E_i}}, \tag{93} $$

which for equal efficiencies simplifies to

$$ \frac{nE}{n + (n - 1)E}. $$

Generalizing gives

$$ E_T = \frac{\sum_i \dfrac{\alpha_i E_i}{1 - \alpha_i E_i}}{1 + \sum_i \dfrac{\alpha_i E_i}{1 - \alpha_i E_i}}. \tag{94} $$


However, this formulation still suffers from a need to compensate for values of $E_T$ greater than
$E_i$. Also, note all this derivation is without considering degradation. Investigations into
analyzing  subtask  adaptation  and  compounding  are  essential  to  further  modeling. 
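The candidate compounding rules can be compared side by side. The sketch below implements three of them as read from the text: the reciprocal-sum rule E_T = 1/sum(1/E_i), the resistance rule with R = (1 - E)/E, and the attention-weighted generalization in which each E_i is scaled by its attention factor. Function names and sample values are assumptions for illustration.

```python
def reciprocal_total(effs):
    """E_T = 1 / sum(1/E_i); gives E/n for n equal efficiencies."""
    return 1.0 / sum(1.0 / e for e in effs)


def resistance_total(effs):
    """E_T = S / (1 + S) with S = sum(E_i / (1 - E_i)); every E_i must be below 1."""
    s = sum(e / (1.0 - e) for e in effs)
    return s / (1.0 + s)


def attention_total(effs, alphas):
    """Same rule with each E_i scaled by its attention factor alpha_i."""
    s = sum(a * e / (1.0 - a * e) for a, e in zip(alphas, effs))
    return s / (1.0 + s)


if __name__ == "__main__":
    effs = [0.5, 0.5, 0.5]
    print(reciprocal_total(effs))                  # E/n
    print(resistance_total(effs))                  # nE / (1 + (n - 1)E) = 0.75
    print(attention_total(effs, [1.0 / 3.0] * 3))  # nE / (n + (n - 1)E) = 0.375
```

The middle result exceeds every individual efficiency, which is the defect the text notes; the attention-weighted version pulls the total back down.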


12.  A  Final  Example 


As  an  exercise  in  setting  up  a  throughput  analysis  (one  that  helps  shape  the  conclusion),  consider 
the  processing  situation  in  figure  5,  where  the  first  processor  is  characterized  by 

$$ \begin{cases} g'(t) = p\,a^{\mu}(t) \\ a'(t) = f'(t) - m\,g'(t) \end{cases} \tag{95} $$

and the second by

$$ \begin{cases} h'(t) = q\,b^{\nu}(t) \\ b'(t) = g'(t) - n\,h'(t). \end{cases} \tag{96} $$

We have

$$ a = f - mg, \tag{97} $$

so

$$ g' = p(f - mg)^{\mu}. \tag{98} $$

The simplifications of assuming a constant driver

$$ f'(t) = k, \tag{99} $$

$$ \mu = 1, \tag{100} $$

and

$$ g(0) = 0 \tag{101} $$

yield

$$ g = \frac{k}{m^2 p}\left[(mpt - 1) + e^{-mpt}\right]. \tag{102} $$

Continuing, we have

$$ b = g - nh, \tag{103} $$

and assuming $\nu = 1$ gives

$$ h' = q(g - nh). \tag{104} $$

With the just-obtained solution for g we can solve for

$$ h\,e^{nqt} = \frac{\alpha q\,e^{nqt - \beta t}}{nq - \beta} + \frac{\alpha\beta(nqt - 1)\,e^{nqt}}{n^2 q} - \frac{\alpha\,e^{nqt}}{n} + C, \tag{105} $$

where $\alpha = k/(m^2 p)$, $\beta = mp$, and C is a constant of integration fixed by the initial value of h.
Proceeding in this manner, a (cumbersome) expression for h(t) can be derived. Visualization
techniques will certainly have applications in analyses of semi-realistic
networks, and simulation is seen to be useful, if not essential, as network complexity increases.
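As a small instance of the simulation approach, the linear two-level system can be integrated numerically and compared with the closed form for g. The sketch assumes mu = nu = 1, a constant driver f = kt, and zero initial values for both g and h; the classical Runge-Kutta stepper and the function names are choices made for this example, not the note's method.

```python
import math


def closed_form_g(t, k, m, p):
    """g(t) = k/(m^2 p) * [(m p t - 1) + exp(-m p t)] for f' = k, mu = 1, g(0) = 0."""
    return k / (m * m * p) * ((m * p * t - 1.0) + math.exp(-m * p * t))


def simulate_two_level(k, m, p, n, q, t_end, steps=1000):
    """Integrate g' = p (k t - m g) and h' = q (g - n h) from g(0) = h(0) = 0."""
    def rates(t, g, h):
        return p * (k * t - m * g), q * (g - n * h)

    dt = t_end / steps
    t = g = h = 0.0
    for _ in range(steps):
        # Classical fourth-order Runge-Kutta step.
        k1g, k1h = rates(t, g, h)
        k2g, k2h = rates(t + dt / 2, g + dt / 2 * k1g, h + dt / 2 * k1h)
        k3g, k3h = rates(t + dt / 2, g + dt / 2 * k2g, h + dt / 2 * k2h)
        k4g, k4h = rates(t + dt, g + dt * k3g, h + dt * k3h)
        g += dt / 6 * (k1g + 2 * k2g + 2 * k3g + k4g)
        h += dt / 6 * (k1h + 2 * k2h + 2 * k3h + k4h)
        t += dt
    return g, h


if __name__ == "__main__":
    g_num, h_num = simulate_two_level(2.0, 3.0, 0.5, 2.0, 0.4, 4.0)
    print(g_num, closed_form_g(4.0, 2.0, 3.0, 0.5), h_num)
```

The same stepper handles exponents other than one or additional processor levels by changing only the `rates` function, which is where closed forms become impractical.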

As a processing function that involves more self-limiting behavior, reconsider the situation

$$ a'(t) = f' - a\left(1 - e^{-a}\right). \tag{106} $$

Then, for constant $f' = k$, we have

$$ \frac{da}{dt} = (k - a) + a\,e^{-a}, \tag{107} $$

which implies

$$ a = \ln\!\left(\frac{\exp[(k - a)(t + c)] - a}{k - a}\right). \tag{108} $$


For $a(0) = 0$, we find

$$ c = \frac{\ln(k - a) + a}{k - a}. \tag{109} $$
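Since the relation for a(t) is implicit, direct numerical integration of da/dt = (k - a) + a exp(-a) may be a more practical way to see the self-limiting behavior. The following is a minimal sketch using a fixed-step Runge-Kutta scheme with a(0) = 0; the function names, step count, and choice k = 1 are assumptions for illustration.

```python
import math


def self_limiting_rate(a, k):
    """Right-hand side da/dt = (k - a) + a * exp(-a)."""
    return (k - a) + a * math.exp(-a)


def integrate_buffer(k, t_end, steps=10000, a0=0.0):
    """Buffer amount a(t_end) starting from a(0) = a0 (classical Runge-Kutta)."""
    dt = t_end / steps
    a = a0
    for _ in range(steps):
        s1 = self_limiting_rate(a, k)
        s2 = self_limiting_rate(a + dt / 2 * s1, k)
        s3 = self_limiting_rate(a + dt / 2 * s2, k)
        s4 = self_limiting_rate(a + dt * s3, k)
        a += dt / 6 * (s1 + 2 * s2 + 2 * s3 + s4)
    return a


if __name__ == "__main__":
    for t_end in (0.5, 1.0, 2.0, 5.0, 50.0):
        print(t_end, integrate_buffer(1.0, t_end))
```

The amount levels off where a (1 - exp(-a)) = k, that is, where input and self-limited output balance.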


A  modification  of  this  problem  that  would  build  on  the  discussion  of  use  of  low-order  facts  by 
high-order  processors  can  be  illustrated  by  figure  6. 


The  circle  indicates  a  decision  concerning  shunting  of  low-order  facts  to  the  separate  processor 
levels.  The  decision  must  be  optimized  within  some  constraints  (e.g.,  binary  switch,  fractional 
separation  as  a  function  of  input  rate).  A  rewording  of  part  of  our  overall  problem  is:  what 
parameter  values  of  qualitative  relationships  make  one  level  preferable  to  another? 


13.  Closing  Thoughts 


Obviously,  this  work  is  mostly  abstract  development,  but  the  intent  is  to  allow  general 
application  to  diverse  data.  Areas  for  theoretical  investigation  include  the  following: 

•  calculus  of  variations 

•  catastrophe  theory 

•  cellular  automata 

•  coding  theory 

•  control  theory 

•  cybernetics 
•  decision  theory

•  fractals 

•  fuzzy  sets 

•  game  theory 

•  information  theory

•  mathematical  programming  (in  particular  dynamic  programming) 

•  measure  theory 

•  networks  (their  topology  and  geometry) 

•  queuing  theory 

•  scheduling 

•  stochastic  processes 

•  systems  theory 

We  intend  to  synergize  with  U.S.  Army  Research  Laboratory  (ARL)  efforts  on  computerized 
simulation  of  data  fusion  networks  and  self-organizing  sensor  communications. 

Many opportunities exist for theoretical and applied work. Is it possible, under certain situations,
to develop a transform of rate in to utility out? The basic problem of time-tagging information
for decay as it moves through the system is challenging. For instance, if it takes a certain amount
to generate output, what is the time a unit of information spends in the processor? For query
consideration, multiple input/output ports should be modeled. Expression of degradation via
differential equations may alleviate certain difficulties. If task accomplishment y(t) is the
desired solution, we could consider the amount, rather than fraction, of task completed. We are
also interested in processor rate as an input and in solving for $dy/dt$ to yield maximum
accomplishment or minimum time. It is probable t may not be the only independent variable,
e.g., some environmental factor $\theta$ may enter as a parameter in the stress function, perhaps

$$ g(t) = k_1 e^{k_2 \theta} f(t). \tag{110} $$

We  are  also  interested  in  the  state  of  the  processor  performing  a  task,  e.g.,  in  order  to  remove  the 
processor  from  action  before  damage  can  occur.  Although  probably  associated  with  overall 
degradation,  we  might  want  state  characteristic  to  be  a  function  of  processing  rate — one 
formulation  is 


Also,  accuracy  and  completion  time  are  probably  associated  with  state  characteristic. 

An  eventual  goal  is  a  self-organizing  cognitive  software  system  encoding  knowledge  as 
multidimensional  threads  and  maximizing  information  storage  and  access  via  automatic 
transformations  of  interim  associations.  With  a  system  of  graph-theoretical  nodes  (objects)  and 
arcs  (relationships),  information  can  be  recalled  through  association — when  one  representation  is 
activated,  even  with  a  fragmented  pattern,  so  are  others  with  common  nodes.  The  effort  entails 
topological,  semantic,  and  set-theoretical  aspects.  The  knowledge  structure,  dynamically
reconfigurable,  would  process  facts,  rules,  and  relations  among  database  information  with
query-resolution  algorithms.  Bidirectional  traversal  of  arcs  could  be  based  on  attributes,  quantification,
negation,  context,  synonyms/commonalities,  supersets,  and  applicable  processes. 
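Recall through association can be sketched in a few lines of Python. The sketch below is a minimal illustration, not the envisioned system: each representation is assumed to be a plain set of nodes, arcs are left implicit, and the class name, method names, and example data are all hypothetical. Activating a fragment returns every stored representation that shares at least one node with it, ranked by how many nodes are shared.

```python
from collections import defaultdict


class AssociativeStore:
    """Toy associative memory: representations are sets of graph nodes."""

    def __init__(self):
        self.representations = {}       # name -> set of nodes
        self.index = defaultdict(set)   # node -> names of representations using it

    def add(self, name, nodes):
        """Store a representation and index it under each of its nodes."""
        self.representations[name] = set(nodes)
        for node in nodes:
            self.index[node].add(name)

    def recall(self, fragment):
        """Return names of representations sharing nodes with the fragment.

        Results are ordered by overlap count (largest first), then by name.
        """
        scores = defaultdict(int)
        for node in fragment:
            for name in self.index.get(node, ()):
                scores[name] += 1
        return sorted(scores, key=lambda name: (-scores[name], name))


if __name__ == "__main__":
    store = AssociativeStore()
    store.add("convoy", {"truck", "road", "fuel"})
    store.add("armor_column", {"tank", "road", "fuel"})
    store.add("airfield", {"runway", "fuel"})
    # A fragmented pattern still activates every representation it touches.
    print(store.recall({"road", "fuel"}))
```

The inverted index from nodes to representations is one simple design choice; it keeps recall proportional to the size of the fragment rather than the size of the store.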

Models  of  knowledge  processing  are  needed  for  intelligent  control.  It  is  not  clear  how  to  obtain 
a  network  minimizing  nodal  distance,  separate  knowledge  bases  for  information-routing  and  the 
application  per  se,  account  heuristically  for  expected  queries,  suspend  actual  computation  as  the 
system  reconfigures  structures,  or  resolve  centralized/distributed  control  trade-offs.  Information 
decay  is  an  interesting  theoretical  problem  (e.g.,  decay  rate  of  a  high-order  fact  may  differ  from 
that  of  corresponding  low-order  facts). 

Several  problems  are  associated  with  developing  information  structures  and  techniques  for 
efficient  query  processing.  Perhaps  rules  approximating  local  answers  can  minimize  data 
transferred  among  nodes.  Future  research  can  explore  structuring  techniques  less  restrictive  than 
subnetting  and  include  (tactically  important)  vague  knowledge.  Indeed,  the  Advanced  Decision 
Architectures  Collaborative  Technology  Alliance  (CTA)  evaluation  panel  recommended 
increasing  research  into  computational  models  of  conceptualizations.  Investigations  into 
detecting  deception,  measures  of  deviation  from  a  plan,  and  text  retrieval  all  indicate  that  a  fruitful
area  of  exploration  involves  representing  information  as  vectors  in  a  space  of  basis  concepts.
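As a rough illustration of the vector representation, the Python sketch below scores how far an observed situation diverges from an expected one. The basis concepts, the weights, and the use of cosine distance as the deviation measure are all assumptions chosen for the example; the note does not prescribe a particular basis or metric.

```python
import math

# Hypothetical basis concepts spanning the space; any fixed ordered set would do.
BASIS = ("armor", "artillery", "logistics", "movement")


def to_vector(weights):
    """Map a {concept: weight} dict onto the basis; missing concepts get 0."""
    return [float(weights.get(concept, 0.0)) for concept in BASIS]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; defined as 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def deviation(expected, observed):
    """Assumed deviation measure: 1 - cosine similarity (0 means same direction)."""
    return 1.0 - cosine_similarity(expected, observed)


if __name__ == "__main__":
    plan = to_vector({"armor": 3, "movement": 2})
    report = to_vector({"armor": 1, "artillery": 2, "logistics": 2})
    print(round(deviation(plan, report), 3))
```

A large deviation between the planned and reported vectors could flag either departure from the plan or possible deception, which is where a threshold or a richer metric would have to be investigated.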

Continued  research  is  intended  to  complement  ongoing  experiments  and  modeling  efforts.  The 
work  benefits  information  science  and  technology  in  areas  like  fusion,  incremental  databasing, 
and  parallel  query  resolution.  It  supports  the  ARL  mission  via  fundamental  research  to  provide 
the  U.S.  Army  necessary  analytical  support.  In  particular,  it  addresses  our  information
technology  goals  of  analysis  and  assimilation  to  help  reduce  the  commander’s  uncertainty.  It 
furthers  ARL’s  strategic  plan  by  focusing  on  key  areas  of  digitization  and  communications 
science  and  by  its  intent  to  utilize  the  larger  CTA  organization  and  results.  This  research  toward 
merging  information  theory  with  control  theory  may  yield  opportunities  for  upgraded  or  new 
commercial  systems  as  well  as  for  battle  command  over  the  tactical  internet. 